Undergrad: Psychology & Applied Math at Texas A&M
Bioinformatics at Iowa State (started in 2009)
Transferred to Statistics in Fall 2010
Finished classes in Spring 2013
Sine Illusion
Visual Aptitude and Graphical Inference
Hierarchy of Graph Features
8-hour Average Ozone Levels in Houston, TX by temperature at Hobby Airport
Residual Ozone Levels in Houston, TX by temperature at Hobby Airport
The sine illusion results from misapplication of a three-dimensional visual heuristic to ambiguous two-dimensional images
In this figure, the vanishing point has been moved towards infinity; the lines are straight and closer to the appearance of the sine illusion. The three-dimensional appearance is still intact.
We perceive the orthogonal width of the implied surface
The orthogonal width is a function of the x and y range as well as the aspect ratio of the plot.
The perceived orthogonal width is also a function of the slope of the line tangent to the underlying function curve.
It is hard to re-create this graphic with separate curves that still provide all of the information
Let \(a\) and \(b\) be the minimum and maximum of the \(x\)-range under consideration.
For any value \(x \in (a,b)\) the following transformation results in a function with constant absolute slope:
Shrinkage factor \(w \in (0,1)\): allows a less extreme approach to counteracting the illusion
\[(f \circ T_w)(x) = (1-w) \cdot x + w \cdot (f \circ T)(x)\]
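The transformation itself is not reproduced in the text above. As a minimal sketch, assume the transformed coordinate rescales \(x\) by the cumulative absolute slope, \(a + (b-a)\int_a^x |f^\prime(z)|\,dz \big/ \int_a^b |f^\prime(z)|\,dz\), which is one transformation that yields constant absolute slope. The function names `x_transform` and `weighted_transform` and the trapezoid-rule integration are illustrative choices, not the authors' implementation.

```python
import math

def x_transform(f_prime, a, b, n=2000):
    """Return a grid xs on [a, b] and the transformed coordinates ts.

    Assumed form: ts = a + (b - a) * (cumulative |f'|) / (total |f'|),
    with the integrals approximated by the trapezoid rule.
    """
    xs = [a + (b - a) * i / n for i in range(n + 1)]
    abs_slope = [abs(f_prime(x)) for x in xs]
    cum = [0.0]
    for i in range(1, n + 1):
        step = (xs[i] - xs[i - 1]) * (abs_slope[i] + abs_slope[i - 1]) / 2
        cum.append(cum[-1] + step)
    total = cum[-1]
    return xs, [a + (b - a) * c / total for c in cum]

def weighted_transform(xs, ts, w):
    """Shrink the correction: (1 - w) * x + w * (transformed x), w in (0, 1)."""
    return [(1 - w) * x + w * t for x, t in zip(xs, ts)]

# Example: f(x) = sin(x) on [0, 2*pi], so f'(x) = cos(x).
xs, ts = x_transform(math.cos, 0.0, 2 * math.pi)
half_corrected = weighted_transform(xs, ts, 0.5)
```

Plotting \(f(x)\) against `ts` instead of `xs` gives a curve whose absolute slope is the same everywhere; `w = 0` leaves the data untouched and `w = 1` applies the full correction.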
If we extend the line length so that the extant width matches the original vertical length, our perceptions will match the original data.
The function describing the orthogonal line through \((x_o, f(x_o))\) is given in point-vector form as
\[ {x_o \choose f(x_o)} + \lambda {f^\prime(x_o) \choose -1} \]
for any real-valued \(\lambda\).
Point-vector form allows us to solve for \(\lambda\) easily, giving the extant (half) widths as:
\[ |\lambda| \sqrt{1 + f^\prime(x_o)^2} \]
This equation describes the quantity that we perceive rather than the quantity that we want to display (\(\ell/2\))
The general correction factor is thus
\[ \ell/2 \cdot \left(|\lambda| \sqrt{1 + f^\prime(x_o)^2}\right)^{-1} \]
This yields two solutions; one for positive and one for negative values of \(\lambda\) corresponding to upper and lower (half) extant width.
In order to get actual numeric values for \(\lambda\), we need to find end points \(f_1\) and \(f_2\). This system of equations provides solutions for those points:
\[ x - x_o = \lambda f^\prime(x_o) \]
\[ f(x) - f(x_o) = -\lambda \pm \ell/2 \]
Solving these equations requires numerical optimization; we will use linear and quadratic Taylor series to simplify the optimization process.
Substituting the endpoints \((x_1, y_1)\) and \((x_2, y_2)\) into the general correction factor produces the linear and quadratic corrections to the sine illusion
\[f(x)\approx f(x_0) + (x-x_0) f^\prime(x_0)\]
The correction factor is then
\[\ell_{new}(x_0) = \ell_{old}\sqrt{1+f^\prime(x_0)^2}\]
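The linear correction can be sketched directly from the formula. The function name `linear_correction` is hypothetical, and the slope is assumed to be expressed in the plot's displayed coordinates (that is, after accounting for aspect ratio).

```python
import math

def linear_correction(length, slope):
    """Corrected line length: l_new = l_old * sqrt(1 + f'(x0)^2)."""
    return length * math.sqrt(1 + slope ** 2)

# Where the curve is flat, nothing changes; where the slope is 1,
# the line is stretched by a factor of sqrt(2).
flat = linear_correction(1.0, 0.0)
steep = linear_correction(1.0, 1.0)
```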
\[f(x) \approx f(x_0) + f^\prime(x_0)(x-x_0) + 1/2 f^{\prime\prime}(x_0)(x-x_0)^2\]
The general system of equations simplifies to
\[
f^{\prime\prime}(x_0) f^\prime(x_0)^2 \lambda^2 + 2 v \lambda \pm \ell = 0,
\]
where \(v = 1 + f^\prime(x_0)^2\)
In the quadratic correction, each half-length is corrected separately, producing a more robust correction
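A minimal sketch of the quadratic correction follows, under stated assumptions: the quadratic in \(\lambda\) is solved in closed form, the root that reduces to the linear solution when \(f^{\prime\prime}(x_0) = 0\) is the one kept, and each half-length factor is the general correction factor \(\ell/2 \cdot (|\lambda|\sqrt{v})^{-1}\). The function name and the error raised when no real root exists are illustrative choices.

```python
import math

def quadratic_half_corrections(length, d1, d2):
    """Correction factors for the two half-lengths at x0.

    length: displayed line length l; d1 = f'(x0); d2 = f''(x0).
    Solves d2 * d1^2 * lam^2 + 2 * v * lam - sign * l = 0 for sign = +1, -1,
    keeping the root that tends to sign * l / (2 * v) as d2 -> 0.
    """
    v = 1 + d1 ** 2
    a = d2 * d1 ** 2
    factors = []
    for sign in (+1, -1):
        disc = v ** 2 + sign * a * length
        if disc < 0:
            raise ValueError("no real root under the quadratic approximation")
        # Algebraically equal to (-v + sqrt(disc)) / a, but valid when a == 0.
        lam = sign * length / (v + math.sqrt(disc))
        factors.append((length / 2) / (abs(lam) * math.sqrt(v)))
    return tuple(factors)

# With zero curvature both factors equal the linear correction sqrt(v).
same = quadratic_half_corrections(1.0, 1.0, 0.0)
# With curvature the two half-lengths receive different factors.
different = quadratic_half_corrections(1.0, 1.0, -1.0)
```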
The y-axis transformation can be weighted in the same manner as the x-axis transformation.
A Shiny applet was created to explore the x and y corrections.
Goal: Determine the strength of the Sine Illusion by measuring how much correction is required for viewers to say that the lines are of equal length.
A different Shiny applet was created to allow users to manipulate the stimuli using fine-grained adjustments to the weight value.
User identification information: a ‘fingerprint’ consisting of hashed browser and computer characteristics was used to identify unique users
IP address localization (34.45.38.XX) provided location information
Every user interaction was recorded with a timestamp
Trial finished when user clicked either ‘submit’ or ‘skip’ to opt out of the trial.
Trial recorded at least two user interactions:
The user must adjust the weight value at least once and then click the submit button.
User completed at least 4 trials
User selected a weight value that was not severely over-corrected or under-corrected (i.e. weight value selected was plausible)
Once exclusion criteria were applied, our data consisted of 125 participants who completed 1210 valid trials.
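The exclusion criteria above could be applied roughly as follows. This is a sketch only: the record fields (`user`, `submitted`, `n_interactions`, `weight`) are hypothetical, the plausible weight range is not stated in the text and must be supplied by the caller, and it is assumed that the four-trial minimum counts valid trials per user.

```python
from collections import Counter

def apply_exclusions(trials, plausible_range, min_interactions=2, min_trials=4):
    """Keep trials that pass the per-trial and per-user criteria.

    trials: list of dicts with hypothetical keys
            'user', 'submitted', 'n_interactions', 'weight'.
    plausible_range: (low, high) bounds on the selected weight; assumed input.
    """
    low, high = plausible_range
    valid = [
        t for t in trials
        if t["submitted"]                           # not skipped
        and t["n_interactions"] >= min_interactions  # adjusted, then submitted
        and low <= t["weight"] <= high               # plausible weight
    ]
    per_user = Counter(t["user"] for t in valid)
    return [t for t in valid if per_user[t["user"]] >= min_trials]
```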
Let \(\gamma_X\) represent the optimal weight value for the \(X\)-correction
and \(\gamma_Y\) represent the optimal weight value for the \(Y\) correction.
\(\gamma_\ast = \frac{1}{2}(w_0 + w_1)\)
where \(w_0\) is the preferred weight when starting at 0, and \(w_1\) is the preferred weight when starting at 1.
\[
W_{ij} = \alpha_{T(i,j)} + \beta X_{ij} + \gamma_{i, T(i,j)} + \epsilon_{ij}\]
\[\gamma_{iX} \stackrel{\text{ i.i.d.}}{\sim} N(0, \eta_X^2) \ \ \ \ \ \ \ \ \gamma_{iY} \stackrel{\text{ i.i.d.}}{\sim} N(0, \eta_Y^2) \]
\[\epsilon_{ij} \stackrel{\text{ i.i.d.}}{\sim} N(0, \sigma^2) \ \ \ \ \ \ \ \ \text{Cov}(\gamma, \epsilon) = 0\]
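To make the model structure concrete, here is a small simulation sketch. It assumes \(X_{ij}\) is an indicator for a starting weight of 1 (0 otherwise) and that \(T(i,j)\) is the correction type (X or Y) shown in trial \(j\) for participant \(i\); neither is defined explicitly in the text. The function name and all parameter values in the usage line are hypothetical.

```python
import random

def simulate_trials(alpha, beta, eta, sigma, n_users, n_trials, seed=0):
    """Draw W_ij = alpha_T + beta * X_ij + gamma_{i,T} + eps_ij.

    alpha, eta: dicts keyed by correction type 'X' / 'Y'.
    gamma_{i,T} ~ N(0, eta_T^2) per participant and type; eps_ij ~ N(0, sigma^2).
    """
    rng = random.Random(seed)
    rows = []
    for i in range(n_users):
        # One random effect per participant and correction type.
        gamma = {t: rng.gauss(0, eta[t]) for t in ("X", "Y")}
        for _ in range(n_trials):
            t = rng.choice(("X", "Y"))
            x = rng.randint(0, 1)  # assumed: 1 if the trial started at weight 1
            w = alpha[t] + beta * x + gamma[t] + rng.gauss(0, sigma)
            rows.append({"user": i, "type": t, "start": x, "weight": w})
    return rows

# Hypothetical parameter values, for illustration only.
rows = simulate_trials({"X": 0.1, "Y": 0.15}, 0.5, {"X": 0.05, "Y": 0.05}, 0.1, 20, 10)
```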
The range of acceptable values is
\[(\alpha_\ast, \alpha_\ast + \beta)\]
We can compare this model to the psychophysics model using the midpoint of this interval, \[\alpha_\ast+\beta/2\]
| Transformation | Threshold | Estimate | 95% C.I. |
|---|---|---|---|
| X | Lower | 0.097 | (0.048, 0.149) |
| X | Upper | 0.625 | (0.570, 0.684) |
| Y | Lower | 0.143 | (0.094, 0.187) |
| Y | Upper | 0.671 | (0.622, 0.717) |
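As a worked check, the interval width \(\beta\) and the midpoint \(\alpha_\ast + \beta/2\) can be read off the table, assuming the lower threshold corresponds to \(\alpha_\ast\) and the upper threshold to \(\alpha_\ast + \beta\). The variable names are illustrative.

```python
# Point estimates (lower, upper) taken from the table above.
thresholds = {"X": (0.097, 0.625), "Y": (0.143, 0.671)}

summary = {}
for kind, (lower, upper) in thresholds.items():
    beta = round(upper - lower, 3)            # width of the acceptable range
    midpoint = round((lower + upper) / 2, 3)  # alpha + beta / 2
    summary[kind] = (beta, midpoint)

# summary == {"X": (0.528, 0.361), "Y": (0.528, 0.407)}
```

Both corrections give the same width, 0.528, which is what a single shared \(\beta\) in the model implies; the midpoints are 0.361 for X and 0.407 for Y.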
Either correction is preferable to an uncorrected graph
Corrections do not have to be fully applied to break the illusion’s power
The sine illusion is strong enough to make participants think that lines of unequal length are equal
We can’t judge variability accurately when there is a nonlinear trend. Knowing is half the battle; having tools to screen for this effect could also be helpful.
R package with functions to correct data
Shiny applet that allows users to upload data and then provides \(x\) and \(y\) corrections
Understand what visual skills are related to comprehension of graphics
(Ideally) Link visual skills to specific types of graphics
Designed to test participants’ ability to find a target stimulus in a field of distractors
Participants are instructed to find the plot numbered 1-24 which matches the plot labeled “Target”. Participants will complete up to 25 of these tasks in 5 minutes
Tests participants’ ability to visualize and mentally manipulate figures in three dimensions. Associated with the ability to extrapolate symmetry and reflection over multiple steps.
Participants are instructed to pick the figure matching the sequence of steps shown in the left-hand figure. Participants will complete up to 20 of these tasks in 6 minutes.
Tests participants’ ability to rotate objects in two dimensions to distinguish between left-hand and right-hand versions of the same figure. Tests spatial reasoning ability and mental rotation skills.
Participants mark each figure on the right-hand side as either the same as or different from the figure on the left-hand side of the dividing line. Participants will complete up to 20 of these tasks (each consisting of 8 figures) in 6 minutes.
This task is associated with visual reasoning capabilities and we expect that it should correlate with the ability to pick out a signal plot from a lineup.
Participants classify each figure in the second row as belonging to group 1, 2, or 3 (if applicable). Participants will complete up to 14 of these tasks (each consisting of 8 figures to classify) in 8 minutes.
Visual Search Task (25 questions)
Card Rotation Task (20 x 8 questions)
Figure Classification Task (14 x 8 questions)
Paper Folding Task (20 questions)
Control stimulus strength (e.g. \(r^2\), group size, point density) to effectively compare color to correlation
Appropriate color schemes to maximize sensory distance (while allowing colorblind viewers to perceive the groups)
Paper submitted to JCGS in Oct 2013, revision submitted in March 2014
Goal: Defend in January or February 2015